Evaluating spatial correspondence of zones in document recognition systems
Abstract
This paper introduces scoring methods developed to automatically assess the performance of document recognition systems; specifically, to evaluate the spatial correspondence of zones produced by a document segmentor. Two different approaches are discussed. The first approach (based on zone overlap and nearest neighbors) is better applied to merged zones, whereas the second approach (based on zone alignments) is better applied to nested zones (such as those found in tables and graphs). Definitions of coverage and efficiency error are presented, and scoring results on real system output are provided that validate the usefulness of these methods for comparing different document recognition algorithms. Currently, no standard testing procedures exist for measuring and comparing algorithms within a complex document recognition system. Scoring methods, like the ones introduced in this paper, serve as design and validation tools, expediting the development and deployment of document analysis technology for system developers and end
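To make the idea of zone overlap concrete, the following is a minimal sketch that measures how much of a reference zone is covered by the best-matching zone from a segmentor. It assumes zones are axis-aligned rectangles; the names `Zone`, `overlap_area`, and `best_overlap_fraction` are hypothetical and chosen for illustration. The abstract does not state the paper's definitions of coverage and efficiency error, so this sketch does not attempt to reproduce them or the nearest-neighbor and alignment steps.

```python
from typing import NamedTuple, Sequence


class Zone(NamedTuple):
    """Axis-aligned rectangle (an assumption; the abstract does not fix a zone shape)."""
    x0: float  # left
    y0: float  # top
    x1: float  # right
    y1: float  # bottom

    def area(self) -> float:
        # Degenerate or inverted rectangles count as empty.
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def overlap_area(a: Zone, b: Zone) -> float:
    """Area of the intersection of two rectangles, 0.0 if they do not overlap."""
    width = min(a.x1, b.x1) - max(a.x0, b.x0)
    height = min(a.y1, b.y1) - max(a.y0, b.y0)
    return width * height if width > 0 and height > 0 else 0.0


def best_overlap_fraction(reference: Zone, hypotheses: Sequence[Zone]) -> float:
    """Largest fraction of the reference zone covered by any single hypothesis zone.

    Illustrative measure only; not the paper's coverage or efficiency error.
    """
    ref_area = reference.area()
    if ref_area == 0.0 or not hypotheses:
        return 0.0
    return max(overlap_area(reference, h) for h in hypotheses) / ref_area


if __name__ == "__main__":
    reference = Zone(0, 0, 10, 10)
    hypotheses = [Zone(0, 0, 5, 10), Zone(20, 20, 30, 30)]
    print(best_overlap_fraction(reference, hypotheses))
```

A value near 1 would suggest the segmentor found a zone that spans the reference region, while a low value hints at a split or missed zone. A practical scorer would also need to penalize hypothesis zones that extend far beyond the reference.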
Publication date: 1995